Capacity-Constrained Query Formulation

Authors

  • Matthias Hagen
  • Benno Stein
Abstract

Given a set of keyphrases, we analyze how to form Web queries from these phrases such that the queries, taken together, return a specified number of hits. The use case is a plagiarism detection system that searches the Web for potentially plagiarized passages in a given suspicious document. For this query formulation problem we develop a heuristic search strategy based on co-occurrence probabilities. Compared to the maximal termset strategy [3], which can be considered the most sensible non-heuristic baseline, our expected savings average 50% when queries for 9 or 10 phrases are to be constructed.
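
The abstract does not spell out the heuristic itself, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes that single-phrase and pairwise hit counts are already known, estimates the hit count of a conjunctive query by chaining pairwise co-occurrence probabilities, and greedily adds phrases until the estimate fits a given capacity. The function names `estimate_hits` and `grow_query`, the chaining rule, and the toy counts are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

HitTable = Dict[str, int]
PairTable = Dict[FrozenSet[str], int]


def estimate_hits(query: Sequence[str], hits: HitTable, pair_hits: PairTable) -> float:
    """Estimate the hits of a conjunctive query from pairwise co-occurrence.

    Assumption (illustrative only): each added phrase scales the estimate by
    the smallest conditional probability P(new | old) over phrases already in
    the query. Phrases are assumed distinct, with positive hit counts.
    """
    if not query:
        raise ValueError("query must contain at least one phrase")
    estimate = float(hits[query[0]])
    for i in range(1, len(query)):
        new = query[i]
        # P(new | old) = hits(new AND old) / hits(old); take the most restrictive.
        conditional = min(
            pair_hits.get(frozenset((new, old)), 0) / hits[old]
            for old in query[:i]
        )
        estimate *= conditional
    return estimate


def grow_query(
    phrases: Sequence[str], capacity: float, hits: HitTable, pair_hits: PairTable
) -> Tuple[List[str], float]:
    """Greedily add phrases (most frequent first) until the estimate fits capacity.

    Returns the query and its estimated hit count. If even the full phrase set
    exceeds the capacity, the full set and its estimate are returned.
    """
    if not phrases:
        raise ValueError("phrases must not be empty")
    ordered = sorted(phrases, key=lambda p: hits[p], reverse=True)
    query: List[str] = []
    estimate = 0.0
    for phrase in ordered:
        query.append(phrase)
        estimate = estimate_hits(query, hits, pair_hits)
        if estimate <= capacity:
            break
    return query, estimate


# Toy counts, invented purely for illustration.
toy_hits = {"a": 1000, "b": 500, "c": 200}
toy_pairs = {
    frozenset(("a", "b")): 250,
    frozenset(("a", "c")): 100,
    frozenset(("b", "c")): 50,
}
print(grow_query(["c", "a", "b"], 300, toy_hits, toy_pairs))
```

In a real setting the estimate would only be used to decide which queries are worth submitting. The actual hit counts returned by the search engine would still have to be checked against the capacity.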

Similar resources

A Framework for Incremental Query Formulation in Mixed-Initiative Case-Based Reasoning

Query formulation is a primary task in the retrieval phase of the case-based reasoning (CBR) cycle, and incremental variants of this task are a distinguishing characteristic of some mixed-initiative CBR approaches (e.g., those that implement a conversational CBR methodology). However, it has rarely been the focus of analysis, which complicates comparing these approaches. We identify the primiti...

A MIQCP formulation for B-spline constraints

This paper presents a mixed-integer quadratically constrained programming (MIQCP) formulation for B-spline constraints. The formulation can be used to obtain an exact MIQCP reformulation of any spline-constrained optimization problem. This reformulation allows practitioners to use a general-purpose MIQCP solver, instead of a special-purpose spline solver, when solving B-spline constrained probl...

The Unified Service Query Language Technical Report

This report presents the specification of the Unified Service Query Language (USQL), which is aimed at supporting the discovery of heterogeneous Web, Peer-to-Peer, and Grid services. At the conceptual level, USQL establishes an abstract and service type-independent viewpoint of the service properties, which are constrained by service requesters through their search criteria. At the grounding le...

Optimization of Single-Satellite Operational Schedules Towards Enhanced Communication Capacity

A growing number of small satellites are being launched and operated to accomplish novel science and technology missions. Scientists and mission operators are seeking to download large quantities of data from these small satellites; however, the satellites are resource-constrained and communicate with capacity-constrained ground networks. Motivated by the need for intelligent operational sche...

A Conic Integer Programming Approach to Constrained Assortment Optimization under the Mixed Multinomial Logit Model

We consider the constrained assortment optimization problem under the mixed multinomial logit model. Even moderately sized instances of this problem are challenging to solve directly using standard mixed-integer linear optimization formulations. This has motivated recent research exploring customized optimization strategies and approximation techniques. In contrast, we develop a novel conic qua...

Publication date: 2010